联合学习(FL)是一个分散的学习范式,其中多个客户在不集中其本地数据的情况下进行培训深度学习模型,因此保留数据隐私。现实世界中的应用程序通常涉及在不同客户端的数据集上进行分发转换,这损害了客户从各自的数据分布中看不见样本的概括能力。在这项工作中,我们解决了最近提出的功能转移问题,其中客户具有不同的功能分布,而标签分布相同。我们建议联邦代表性扩大(FRAUG)来解决这个实用且具有挑战性的问题。我们的方法在嵌入空间中生成合成客户端特定的样本,以增加通常小客户端数据集。为此,我们训练一个共享的生成模型,以融合客户从其不同功能分布中学习的知识。该发电机合成了客户端 - 不合时式嵌入,然后通过表示转换网络(RTNET)将其局部转换为特定于客户端的嵌入。通过将知识转移到客户端,生成的嵌入式作为客户模型的正常化程序,并减少对本地原始数据集的过度拟合,从而改善了概括。我们对公共基准和现实医学数据集的经验评估证明了该方法的有效性,该方法在包括Partialfed和FedBN在内的非IID特征的当前最新FL方法大大优于最新的FL方法。
translated by 谷歌翻译
Producing high-quality forecasts of key climate variables such as temperature and precipitation on subseasonal time scales has long been a gap in operational forecasting. Recent studies have shown promising results using machine learning (ML) models to advance subseasonal forecasting (SSF), but several open questions remain. First, several past approaches use the average of an ensemble of physics-based forecasts as an input feature of these models. However, ensemble forecasts contain information that can aid prediction beyond only the ensemble mean. Second, past methods have focused on average performance, whereas forecasts of extreme events are far more important for planning and mitigation purposes. Third, climate forecasts correspond to a spatially-varying collection of forecasts, and different methods account for spatial variability in the response differently. Trade-offs between different approaches may be mitigated with model stacking. This paper describes the application of a variety of ML methods used to predict monthly average precipitation and two meter temperature using physics-based predictions (ensemble forecasts) and observational data such as relative humidity, pressure at sea level, or geopotential height, two weeks in advance for the whole continental United States. Regression, quantile regression, and tercile classification tasks using linear models, random forests, convolutional neural networks, and stacked models are considered. The proposed models outperform common baselines such as historical averages (or quantiles) and ensemble averages (or quantiles). This paper further includes an investigation of feature importance, trade-offs between using the full ensemble or only the ensemble average, and different modes of accounting for spatial variability.
translated by 谷歌翻译
专门针对联合学习(SDAGFL)的定向无环图(SDAGFL)是一个新的联合学习框架,它通过有向的无循环图分布式分类帐技术(DAG-DLT)从设备中更新模型。 SDAGFL具有个性化的优势,可抵抗完全分散的联邦学习中的单点失败和中毒攻击。由于这些优点,SDAGFL适用于在设备通常由电池供电的物联网场景中进行联合学习。为了促进SDAGFL在物联网中的应用,我们提出了一个基于ESDAGFL的基于事件触发的通信机制的能量优化的。在ESDAGFL中,仅当新模型发生显着更改时才会广播。我们在莎士比亚和歌德的作品中从群集的合成女性数据集中评估了eSDAGFL的合成女性数据集和数据集。实验结果表明,与SDAGFL相比,我们的方法可以将能源消耗降低33 \%,并在训练准确性和专业化之间达到与SDAGFL相同的平衡。
translated by 谷歌翻译
几乎没有射击的内在学习(ICL)使预训练的语言模型能够通过为输入的一部分提供少量的培训示例来执行以前的任务,而无需任何基于梯度的培训。 ICL会产生大量的计算,内存和存储成本,因为它每次进行预测时都涉及处理所有培训示例。参数有效的微调(PEFT)(例如,适配器模块,提示调谐,稀疏更新方法等)提供了替代范式,其中训练了一组少量参数以启用模型来执行新任务。在本文中,我们严格地比较了几个ICL和PEFT,并证明后者提供了更好的准确性,并大大降低了计算成本。在此过程中,我们引入了一种称为(IA)$^3 $的新PEFT方法,该方法通过学习的向量来扩展激活,从而获得更强的性能,同时仅引入相对少量的新参数。我们还提出了一个基于称为T-FEW的T0模型的简单食谱,可以将其应用于新任务,而无需特定于任务的调整或修改。我们通过将T-FEW应用于木筏基准,首次实现超人性能,并以6%的绝对性能优于最先进的方法来验证T-FEW对完全看不见的任务的有效性。我们实验中使用的所有代码均可公开使用。
translated by 谷歌翻译
Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on two datasets with different organs and modalities, where it substantially outperforms existing techniques.
translated by 谷歌翻译
The development of social media user stance detection and bot detection methods rely heavily on large-scale and high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, suppressing graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built based on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain and user tweet features as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when introducing multiple relations. By analyzing experiment results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
translated by 谷歌翻译
As one of the prevalent methods to achieve automation systems, Imitation Learning (IL) presents a promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, the corresponding research on the explainability of IL models is still limited. Inspired by the recent approaches in explainable artificial intelligence methods, we proposed a model-agnostic explaining framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in demonstrations. It iteratively retrains the black-box IL model from the randomized masked demonstrations and uses the conventional evaluation outcome environment returns as the coefficient to build an importance map. We also conducted experiments to investigate three major questions concerning frames' importance equality, the effectiveness of the importance map, and connections between importance maps from different IL models. The result shows that R2RISE successfully distinguishes important frames from the demonstrations.
translated by 谷歌翻译
Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.
translated by 谷歌翻译
We propose a distributionally robust return-risk model for Markov decision processes (MDPs) under risk and reward ambiguity. The proposed model optimizes the weighted average of mean and percentile performances, and it covers the distributionally robust MDPs and the distributionally robust chance-constrained MDPs (both under reward ambiguity) as special cases. By considering that the unknown reward distribution lies in a Wasserstein ambiguity set, we derive the tractable reformulation for our model. In particular, we show that that the return-risk model can also account for risk from uncertain transition kernel when one only seeks deterministic policies, and that a distributionally robust MDP under the percentile criterion can be reformulated as its nominal counterpart at an adjusted risk level. A scalable first-order algorithm is designed to solve large-scale problems, and we demonstrate the advantages of our proposed model and algorithm through numerical experiments.
translated by 谷歌翻译
Witnessing the impressive achievements of pre-training techniques on large-scale data in the field of computer vision and natural language processing, we wonder whether this idea could be adapted in a grab-and-go spirit, and mitigate the sample inefficiency problem for visuomotor driving. Given the highly dynamic and variant nature of the input, the visuomotor driving task inherently lacks view and translation invariance, and the visual input contains massive irrelevant information for decision making, resulting in predominant pre-training approaches from general vision less suitable for the autonomous driving task. To this end, we propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework curated for the policy pretraining in visuomotor driving. We aim at learning policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos. The proposed PPGeo is performed in two stages to support effective self-supervised training. In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input. In the second stage, the visual encoder learns driving policy representation by predicting the future ego-motion and optimizing with the photometric error based on current visual observation only. As such, the pre-trained visual encoder is equipped with rich driving policy related representations and thereby competent for multiple visuomotor driving tasks. Extensive experiments covering a wide span of challenging scenarios have demonstrated the superiority of our proposed approach, where improvements range from 2% to even over 100% with very limited data. Code and models will be available at https://github.com/OpenDriveLab/PPGeo.
translated by 谷歌翻译